Overview

Dataset statistics

Number of variables: 20
Number of observations: 4027
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 601.8 KiB
Average record size in memory: 153.0 B
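
The two memory figures above are consistent with each other: dividing the total size by the number of observations gives the average record size. A minimal check, assuming that KiB means 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Reported values from the dataset statistics.
total_size_kib = 601.8
n_observations = 4027

# Assumption: 1 KiB = 1024 bytes.
average_record_bytes = total_size_kib * 1024 / n_observations
print(f"{average_record_bytes:.1f} B")
```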

Variable types

Text: 3
Numeric: 4
Categorical: 11
DateTime: 1
Boolean: 1

Alerts

High correlation: card_number is highly overall correlated with customer_segment and 3 other fields
High correlation: trans_amount is highly overall correlated with fake
High correlation: customer_segment is highly overall correlated with card_number
High correlation: card_type is highly overall correlated with card_number
High correlation: customer_location is highly overall correlated with card_number and 1 other fields
High correlation: merchant_name is highly overall correlated with fake
High correlation: trans_loc is highly overall correlated with card_number and 1 other fields
High correlation: trans_currency is highly overall correlated with fake
High correlation: fake is highly overall correlated with trans_amount and 2 other fields
Imbalance: trans_currency is highly imbalanced (59.0%)
Unique: trans_id has unique values
Unique: trans_approval_code has unique values

Reproduction

Analysis started: 2023-09-09 00:46:25.227417
Analysis finished: 2023-09-09 00:48:45.085118
Duration: 2 minutes and 19.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
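
The reported duration follows directly from the two timestamps. A small sketch that recomputes it with the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-09-09 00:46:25.227417")
finished = datetime.fromisoformat("2023-09-09 00:48:45.085118")

duration = finished - started
minutes, seconds = divmod(duration.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```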

Variables

Distinct992
Distinct (%)24.6%
Missing0
Missing (%)0.0%
Memory size31.6 KiB

Length

Max length28
Median length22
Mean length13.31711
Min length8

Characters and Unicode

Total characters: 53628
Distinct characters: 52
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
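
As a rough illustration of how such category counts can be derived, the sketch below classifies each character with Python's `unicodedata` module. It assumes the report's category labels correspond to Unicode general categories; the mapping table and the function name `category_counts` are illustrative, and the two input strings are taken from this variable's sample rows.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pc": "Connector Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["Kaitlin Edwards", "Kyle Armstrong"]
print(category_counts(sample))
```

Script and block detection are not covered here because the standard library does not expose those properties directly.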

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowKaitlin Edwards
2nd rowKaitlin Edwards
3rd rowKyle Armstrong
4th rowKyle Armstrong
5th rowKyle Armstrong
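
The report lists Distinct and Unique separately. A minimal sketch of the difference, assuming Distinct counts the different values present and Unique counts the values that occur exactly once (this reading is consistent with Distinct 992 and Unique 0 for this variable):

```python
from collections import Counter

# The five sample rows listed above.
rows = ["Kaitlin Edwards", "Kaitlin Edwards",
        "Kyle Armstrong", "Kyle Armstrong", "Kyle Armstrong"]

value_counts = Counter(rows)
distinct = len(value_counts)                               # different values present
unique = sum(1 for n in value_counts.values() if n == 1)   # values seen exactly once

print(distinct, unique)
```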
Value               Count  Frequency (%)
smith               134    1.6%
david               91     1.1%
brown               80     1.0%
james               77     0.9%
christopher         77     0.9%
michael             74     0.9%
jennifer            71     0.9%
robert              65     0.8%
williams            64     0.8%
john                63     0.8%
Other values (805)  7432   90.3%

Most occurring characters

Value              Count  Frequency (%)
a                  5059   9.4%
e                  4870   9.1%
(space)            4201   7.8%
r                  3902   7.3%
n                  3849   7.2%
i                  3356   6.3%
o                  2947   5.5%
l                  2728   5.1%
s                  2392   4.5%
t                  1782   3.3%
Other values (42)  18542  34.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40985
76.4%
Uppercase Letter 8374
 
15.6%
Space Separator 4201
 
7.8%
Other Punctuation 68
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5059
12.3%
e 4870
11.9%
r 3902
9.5%
n 3849
9.4%
i 3356
 
8.2%
o 2947
 
7.2%
l 2728
 
6.7%
s 2392
 
5.8%
t 1782
 
4.3%
h 1710
 
4.2%
Other values (16) 8390
20.5%
Uppercase Letter
ValueCountFrequency (%)
M 863
 
10.3%
J 819
 
9.8%
A 615
 
7.3%
C 593
 
7.1%
D 582
 
7.0%
S 573
 
6.8%
B 570
 
6.8%
R 556
 
6.6%
W 427
 
5.1%
H 388
 
4.6%
Other values (14) 2388
28.5%
Space Separator
ValueCountFrequency (%)
4201
100.0%
Other Punctuation
ValueCountFrequency (%)
. 68
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49359
92.0%
Common 4269
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5059
 
10.2%
e 4870
 
9.9%
r 3902
 
7.9%
n 3849
 
7.8%
i 3356
 
6.8%
o 2947
 
6.0%
l 2728
 
5.5%
s 2392
 
4.8%
t 1782
 
3.6%
h 1710
 
3.5%
Other values (40) 16764
34.0%
Common
ValueCountFrequency (%)
4201
98.4%
. 68
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53628
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5059
 
9.4%
e 4870
 
9.1%
4201
 
7.8%
r 3902
 
7.3%
n 3849
 
7.2%
i 3356
 
6.3%
o 2947
 
5.5%
l 2728
 
5.1%
s 2392
 
4.5%
t 1782
 
3.3%
Other values (42) 18542
34.6%

card_number
Real number (ℝ)

HIGH CORRELATION 

Distinct1000
Distinct (%)24.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean: 5.1044664 × 10^11
Minimum: 1.0643483 × 10^8
Maximum: 9.9816735 × 10^11
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size31.6 KiB

Quantile statistics

Minimum: 1.0643483 × 10^8
5-th percentile: 4.6308004 × 10^10
Q1: 2.4712208 × 10^11
median: 5.234143 × 10^11
Q3: 7.8276933 × 10^11
95-th percentile: 9.4903125 × 10^11
Maximum: 9.9816735 × 10^11
Range: 9.9806091 × 10^11
Interquartile range (IQR): 5.3564726 × 10^11

Descriptive statistics

Standard deviation: 3.0148029 × 10^11
Coefficient of variation (CV): 0.59062058
Kurtosis: -1.2963502
Mean: 5.1044664 × 10^11
Median Absolute Deviation (MAD): 2.6747445 × 10^11
Skewness: -0.060387536
Sum: 2.0555686 × 10^15
Variance: 9.0890366 × 10^22
MonotonicityNot monotonic
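
Two of the descriptive statistics above can be recovered from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short sketch using the reported (rounded) figures, so the last digits may differ slightly:

```python
# Reported descriptive statistics for card_number.
mean = 5.1044664e11
std = 3.0148029e11

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation

print(f"CV = {cv:.6f}")
print(f"variance = {variance:.4e}")
```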
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
7.945507785 × 10^11  6      0.1%
3.697327146 × 10^11  6      0.1%
5.168595377 × 10^11  6      0.1%
5.3052258 × 10^11    6      0.1%
6.803811798 × 10^11  6      0.1%
9.191467717 × 10^11  6      0.1%
4.990108615 × 10^11  6      0.1%
4.171436207 × 10^11  6      0.1%
8.371353589 × 10^11  6      0.1%
9.962911911 × 10^11  6      0.1%
Other values (990)   3967   98.5%

Value       Count  Frequency (%)
106434833   5      0.1%
199806849   6      0.1%
1418313629  3      0.1%
2039434588  4      0.1%
2716596742  5      0.1%
2891370518  4      0.1%
2920966534  5      0.1%
2999893985  6      0.1%
5137516005  2      < 0.1%
6647279318  5      0.1%

Value                Count  Frequency (%)
9.981673484 × 10^11  3      0.1%
9.971857143 × 10^11  5      0.1%
9.962911911 × 10^11  6      0.1%
9.954227176 × 10^11  2      < 0.1%
9.947235405 × 10^11  3      0.1%
9.940009864 × 10^11  5      0.1%
9.937132356 × 10^11  4      0.1%
9.935182464 × 10^11  5      0.1%
9.891801355 × 10^11  3      0.1%
9.888544788 × 10^11  3      0.1%
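
The Frequency (%) column is the count divided by the 4027 observations. The sketch below reproduces the displayed strings under one assumption inferred from the tables rather than from the profiler's code: a non-zero share that rounds to 0.0% is shown as "< 0.1%". The function name `format_frequency` is illustrative.

```python
def format_frequency(count, total):
    """Format count/total as a percentage with one decimal place."""
    pct = 100 * count / total
    text = f"{pct:.1f}%"
    # Assumption: a non-zero share that rounds to 0.0% is shown as "< 0.1%".
    if count > 0 and text == "0.0%":
        return "< 0.1%"
    return text

N_OBSERVATIONS = 4027
for count in (6, 3, 2):
    print(count, format_frequency(count, N_OBSERVATIONS))
```

With this rule a count of 3 displays as 0.1% while a count of 2 displays as "< 0.1%", matching the rows above.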

customer_age
Real number (ℝ)

Distinct50
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean43.144276
Minimum18
Maximum69
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size31.6 KiB

Quantile statistics

Minimum18
5-th percentile20
Q129
median42
Q356
95-th percentile67
Maximum69
Range51
Interquartile range (IQR)27
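
The last two quantile statistics are simple differences of the others: the range spans the full data and the interquartile range spans the middle 50%. A brief check with the customer_age values (variable names are illustrative):

```python
# Reported quantiles for customer_age.
minimum, q1, median, q3, maximum = 18, 29, 42, 56, 69

value_range = maximum - minimum   # spread of the full data
iqr = q3 - q1                     # spread of the middle 50%

print(value_range, iqr)
```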

Descriptive statistics

Standard deviation15.743324
Coefficient of variation (CV)0.36489948
Kurtosis-1.3017135
Mean43.144276
Median Absolute Deviation (MAD)14
Skewness0.031685929
Sum173742
Variance247.85225
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39 192
 
4.8%
29 171
 
4.2%
63 146
 
3.6%
60 144
 
3.6%
18 141
 
3.5%
69 138
 
3.4%
55 134
 
3.3%
51 128
 
3.2%
66 127
 
3.2%
47 126
 
3.1%
Other values (40) 2580
64.1%
ValueCountFrequency (%)
18 141
3.5%
19 55
 
1.4%
20 107
2.7%
21 62
1.5%
22 119
3.0%
23 80
2.0%
25 124
3.1%
26 91
2.3%
27 84
2.1%
28 81
2.0%
ValueCountFrequency (%)
69 138
3.4%
68 5
 
0.1%
67 106
2.6%
66 127
3.2%
65 36
 
0.9%
64 71
1.8%
63 146
3.6%
62 35
 
0.9%
61 122
3.0%
60 144
3.6%

customer_segment
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
Other
852 
Retail
829 
Student
809 
Premium
771 
Business
766 

Length

Max length8
Median length7
Mean length6.5612118
Min length5

Characters and Unicode

Total characters26422
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBusiness
2nd rowBusiness
3rd rowRetail
4th rowRetail
5th rowRetail

Common Values

ValueCountFrequency (%)
Other 852
21.2%
Retail 829
20.6%
Student 809
20.1%
Premium 771
19.1%
Business 766
19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
other 852
21.2%
retail 829
20.6%
student 809
20.1%
premium 771
19.1%
business 766
19.0%

Most occurring characters

ValueCountFrequency (%)
e 4027
15.2%
t 3299
12.5%
i 2366
9.0%
u 2346
8.9%
s 2298
8.7%
r 1623
 
6.1%
n 1575
 
6.0%
m 1542
 
5.8%
O 852
 
3.2%
h 852
 
3.2%
Other values (7) 5642
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22395
84.8%
Uppercase Letter 4027
 
15.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4027
18.0%
t 3299
14.7%
i 2366
10.6%
u 2346
10.5%
s 2298
10.3%
r 1623
7.2%
n 1575
 
7.0%
m 1542
 
6.9%
h 852
 
3.8%
a 829
 
3.7%
Other values (2) 1638
7.3%
Uppercase Letter
ValueCountFrequency (%)
O 852
21.2%
R 829
20.6%
S 809
20.1%
P 771
19.1%
B 766
19.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26422
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4027
15.2%
t 3299
12.5%
i 2366
9.0%
u 2346
8.9%
s 2298
8.7%
r 1623
 
6.1%
n 1575
 
6.0%
m 1542
 
5.8%
O 852
 
3.2%
h 852
 
3.2%
Other values (7) 5642
21.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26422
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4027
15.2%
t 3299
12.5%
i 2366
9.0%
u 2346
8.9%
s 2298
8.7%
r 1623
 
6.1%
n 1575
 
6.0%
m 1542
 
5.8%
O 852
 
3.2%
h 852
 
3.2%
Other values (7) 5642
21.4%

card_type
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
American Express
901 
Visa
804 
MasterCard
795 
Rupay
773 
Other
754 

Length

Max length16
Median length10
Mean length8.2485721
Min length4

Characters and Unicode

Total characters33217
Distinct characters23
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMasterCard
2nd rowMasterCard
3rd rowAmerican Express
4th rowAmerican Express
5th rowAmerican Express

Common Values

ValueCountFrequency (%)
American Express 901
22.4%
Visa 804
20.0%
MasterCard 795
19.7%
Rupay 773
19.2%
Other 754
18.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
american 901
18.3%
express 901
18.3%
visa 804
16.3%
mastercard 795
16.1%
rupay 773
15.7%
other 754
15.3%
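
The word table above differs from the Common Values table because "American Express" contributes two words. A sketch that rebuilds the word counts from the category counts, assuming values are lowercased and split on whitespace (an inference from the table, not taken from the profiler's code):

```python
from collections import Counter

# Category counts from the card_type Common Values table.
category_counts = {"American Express": 901, "Visa": 804, "MasterCard": 795,
                   "Rupay": 773, "Other": 754}

word_counts = Counter()
for value, count in category_counts.items():
    # Assumption: values are lowercased and split on whitespace.
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word} {count} {100 * count / total_words:.1f}%")
```

The percentages are relative to the total number of words (4928), not the number of rows, which is why "visa" shows 16.3% here but 20.0% in Common Values.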

Most occurring characters

ValueCountFrequency (%)
r 4146
 
12.5%
a 4068
 
12.2%
s 3401
 
10.2%
e 3351
 
10.1%
i 1705
 
5.1%
p 1674
 
5.0%
t 1549
 
4.7%
x 901
 
2.7%
m 901
 
2.7%
A 901
 
2.7%
Other values (13) 10620
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26593
80.1%
Uppercase Letter 5723
 
17.2%
Space Separator 901
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4146
15.6%
a 4068
15.3%
s 3401
12.8%
e 3351
12.6%
i 1705
6.4%
p 1674
6.3%
t 1549
 
5.8%
x 901
 
3.4%
m 901
 
3.4%
n 901
 
3.4%
Other values (5) 3996
15.0%
Uppercase Letter
ValueCountFrequency (%)
A 901
15.7%
E 901
15.7%
V 804
14.0%
M 795
13.9%
C 795
13.9%
R 773
13.5%
O 754
13.2%
Space Separator
ValueCountFrequency (%)
901
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32316
97.3%
Common 901
 
2.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4146
12.8%
a 4068
12.6%
s 3401
 
10.5%
e 3351
 
10.4%
i 1705
 
5.3%
p 1674
 
5.2%
t 1549
 
4.8%
x 901
 
2.8%
m 901
 
2.8%
A 901
 
2.8%
Other values (12) 9719
30.1%
Common
ValueCountFrequency (%)
901
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33217
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 4146
 
12.5%
a 4068
 
12.2%
s 3401
 
10.2%
e 3351
 
10.1%
i 1705
 
5.1%
p 1674
 
5.0%
t 1549
 
4.7%
x 901
 
2.7%
m 901
 
2.7%
A 901
 
2.7%
Other values (13) 10620
32.0%

customer_location
Categorical

HIGH CORRELATION 

Distinct10
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
Delhi
507 
Chennai
472 
Pune
451 
Jaipur
419 
Hyderabad
419 
Other values (5)
1759 

Length

Max length9
Median length7
Mean length6.7698038
Min length4

Characters and Unicode

Total characters27262
Distinct characters29
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDelhi
2nd rowDelhi
3rd rowJaipur
4th rowJaipur
5th rowJaipur

Common Values

ValueCountFrequency (%)
Delhi 507
12.6%
Chennai 472
11.7%
Pune 451
11.2%
Jaipur 419
10.4%
Hyderabad 419
10.4%
Kolkata 406
10.1%
Bangalore 357
8.9%
Lucknow 349
8.7%
Mumbai 329
8.2%
Ahmedabad 318
7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
delhi 507
12.6%
chennai 472
11.7%
pune 451
11.2%
jaipur 419
10.4%
hyderabad 419
10.4%
kolkata 406
10.1%
bangalore 357
8.9%
lucknow 349
8.7%
mumbai 329
8.2%
ahmedabad 318
7.9%

Most occurring characters

ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23235
85.2%
Uppercase Letter 4027
 
14.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4220
18.2%
e 2524
10.9%
n 2101
9.0%
i 1727
 
7.4%
u 1548
 
6.7%
d 1474
 
6.3%
h 1297
 
5.6%
l 1270
 
5.5%
r 1195
 
5.1%
o 1112
 
4.8%
Other values (9) 4767
20.5%
Uppercase Letter
ValueCountFrequency (%)
D 507
12.6%
C 472
11.7%
P 451
11.2%
H 419
10.4%
J 419
10.4%
K 406
10.1%
B 357
8.9%
L 349
8.7%
M 329
8.2%
A 318
7.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 27262
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

merchant_name
Categorical

HIGH CORRELATION 

Distinct18
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
amazon gift cards
446 
amazon
425 
swiggy
420 
zomato
413 
rakuten
403 
Other values (13)
1920 

Length

Max length17
Median length15
Mean length9.7035014
Min length6

Characters and Unicode

Total characters39076
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowairtel
2nd rowamazon
3rd rowchai talks
4th rowfake_merchant_6
5th rowinstamart

Common Values

ValueCountFrequency (%)
amazon gift cards 446
11.1%
amazon 425
10.6%
swiggy 420
10.4%
zomato 413
10.3%
rakuten 403
10.0%
airtel 400
9.9%
instamart 380
9.4%
chai talks 359
8.9%
fake_merchant_9 95
 
2.4%
fake_merchant_1 84
 
2.1%
Other values (8) 602
14.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
amazon 871
16.5%
gift 446
8.5%
cards 446
8.5%
swiggy 420
8.0%
zomato 413
7.8%
rakuten 403
7.6%
airtel 400
7.6%
instamart 380
7.2%
chai 359
6.8%
talks 359
6.8%
Other values (10) 781
14.8%

Most occurring characters

ValueCountFrequency (%)
a 6444
16.5%
t 3562
 
9.1%
m 2445
 
6.3%
n 2435
 
6.2%
r 2410
 
6.2%
e 2365
 
6.1%
i 2005
 
5.1%
o 1697
 
4.3%
s 1605
 
4.1%
c 1586
 
4.1%
Other values (22) 12522
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35482
90.8%
Connector Punctuation 1562
 
4.0%
Space Separator 1251
 
3.2%
Decimal Number 781
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6444
18.2%
t 3562
 
10.0%
m 2445
 
6.9%
n 2435
 
6.9%
r 2410
 
6.8%
e 2365
 
6.7%
i 2005
 
5.7%
o 1697
 
4.8%
s 1605
 
4.5%
c 1586
 
4.5%
Other values (10) 8928
25.2%
Decimal Number
ValueCountFrequency (%)
9 95
12.2%
1 84
10.8%
0 83
10.6%
6 77
9.9%
7 77
9.9%
8 77
9.9%
4 75
9.6%
2 73
9.3%
5 72
9.2%
3 68
8.7%
Connector Punctuation
ValueCountFrequency (%)
_ 1562
100.0%
Space Separator
ValueCountFrequency (%)
1251
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 35482
90.8%
Common 3594
 
9.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6444
18.2%
t 3562
 
10.0%
m 2445
 
6.9%
n 2435
 
6.9%
r 2410
 
6.8%
e 2365
 
6.7%
i 2005
 
5.7%
o 1697
 
4.8%
s 1605
 
4.5%
c 1586
 
4.5%
Other values (10) 8928
25.2%
Common
ValueCountFrequency (%)
_ 1562
43.5%
1251
34.8%
9 95
 
2.6%
1 84
 
2.3%
0 83
 
2.3%
6 77
 
2.1%
7 77
 
2.1%
8 77
 
2.1%
4 75
 
2.1%
2 73
 
2.0%
Other values (2) 140
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6444
16.5%
t 3562
 
9.1%
m 2445
 
6.3%
n 2435
 
6.2%
r 2410
 
6.2%
e 2365
 
6.1%
i 2005
 
5.1%
o 1697
 
4.3%
s 1605
 
4.1%
c 1586
 
4.1%
Other values (22) 12522
32.0%
Distinct20
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
Singapore, Singapore
 
235
Paris, France
 
223
London, UK
 
222
Chennai
 
219
Dubai, UAE
 
219
Other values (15)
2909 

Length

Max length23
Median length18
Mean length11.530916
Min length4

Characters and Unicode

Total characters46435
Distinct characters42
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowToronto, Canada
2nd rowCape Town, South Africa
3rd rowSydney, Australia
4th rowParis, France
5th rowKolkata

Common Values

ValueCountFrequency (%)
Singapore, Singapore 235
 
5.8%
Paris, France 223
 
5.5%
London, UK 222
 
5.5%
Chennai 219
 
5.4%
Dubai, UAE 219
 
5.4%
Bangalore 213
 
5.3%
Ahmedabad 207
 
5.1%
Tokyo, Japan 202
 
5.0%
Cape Town, South Africa 202
 
5.0%
Hyderabad 201
 
5.0%
Other values (10) 1884
46.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
singapore 470
 
6.5%
paris 223
 
3.1%
france 223
 
3.1%
london 222
 
3.1%
uk 222
 
3.1%
chennai 219
 
3.0%
dubai 219
 
3.0%
uae 219
 
3.0%
bangalore 213
 
2.9%
ahmedabad 207
 
2.9%
Other values (25) 4822
66.4%

Most occurring characters

ValueCountFrequency (%)
a 5464
 
11.8%
o 3478
 
7.5%
n 3345
 
7.2%
3232
 
7.0%
e 2854
 
6.1%
i 2800
 
6.0%
r 2670
 
5.7%
, 2066
 
4.4%
d 1601
 
3.4%
u 1358
 
2.9%
Other values (32) 17567
37.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32999
71.1%
Uppercase Letter 8138
 
17.5%
Space Separator 3232
 
7.0%
Other Punctuation 2066
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5464
16.6%
o 3478
10.5%
n 3345
10.1%
e 2854
 
8.6%
i 2800
 
8.5%
r 2670
 
8.1%
d 1601
 
4.9%
u 1358
 
4.1%
p 1068
 
3.2%
y 973
 
2.9%
Other values (12) 7388
22.4%
Uppercase Letter
ValueCountFrequency (%)
S 1057
13.0%
A 1013
12.4%
C 818
10.1%
U 641
 
7.9%
T 601
 
7.4%
J 577
 
7.1%
L 423
 
5.2%
P 415
 
5.1%
K 411
 
5.1%
D 399
 
4.9%
Other values (8) 1783
21.9%
Space Separator
ValueCountFrequency (%)
3232
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2066
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41137
88.6%
Common 5298
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5464
 
13.3%
o 3478
 
8.5%
n 3345
 
8.1%
e 2854
 
6.9%
i 2800
 
6.8%
r 2670
 
6.5%
d 1601
 
3.9%
u 1358
 
3.3%
p 1068
 
2.6%
S 1057
 
2.6%
Other values (30) 15442
37.5%
Common
ValueCountFrequency (%)
3232
61.0%
, 2066
39.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46435
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5464
 
11.8%
o 3478
 
7.5%
n 3345
 
7.2%
3232
 
7.0%
e 2854
 
6.1%
i 2800
 
6.0%
r 2670
 
5.7%
, 2066
 
4.4%
d 1601
 
3.4%
u 1358
 
2.9%
Other values (32) 17567
37.8%

trans_id
Text

UNIQUE 

Distinct4027
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size31.6 KiB

Length

Max length18
Median length18
Mean length18
Min length18

Characters and Unicode

Total characters72486
Distinct characters62
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4027 ?
Unique (%)100.0%

Sample

1st rowAEBUCBZwjn20SdfGhM
2nd rowJxnOEJ8q7Y61CLUW3m
3rd rowsrGxuWPUWLuRlx6ACR
4th row5OmAOD8rKVnyn26d5c
5th rowSbZLIak4yqaYK1lbdY
ValueCountFrequency (%)
aebucbzwjn20sdfghm 1
 
< 0.1%
k7asoi49mzb37ylhne 1
 
< 0.1%
unuszi3gg65ju9t1gb 1
 
< 0.1%
ylstvtznaeir90pdqu 1
 
< 0.1%
srgxuwpuwlurlx6acr 1
 
< 0.1%
5omaod8rkvnyn26d5c 1
 
< 0.1%
sbzliak4yqayk1lbdy 1
 
< 0.1%
rahmtzc4udfmztrjnn 1
 
< 0.1%
f32xiqfpfke6y68dvb 1
 
< 0.1%
dpjfgellzxc4rxpg2m 1
 
< 0.1%
Other values (4017) 4017
99.8%

Most occurring characters

ValueCountFrequency (%)
p 1253
 
1.7%
Q 1234
 
1.7%
C 1233
 
1.7%
Y 1228
 
1.7%
3 1221
 
1.7%
2 1220
 
1.7%
a 1214
 
1.7%
r 1212
 
1.7%
N 1209
 
1.7%
j 1204
 
1.7%
Other values (52) 60258
83.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 30443
42.0%
Lowercase Letter 30323
41.8%
Decimal Number 11720
 
16.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 1253
 
4.1%
a 1214
 
4.0%
r 1212
 
4.0%
j 1204
 
4.0%
m 1198
 
4.0%
w 1193
 
3.9%
b 1188
 
3.9%
z 1188
 
3.9%
q 1185
 
3.9%
f 1180
 
3.9%
Other values (16) 18308
60.4%
Uppercase Letter
ValueCountFrequency (%)
Q 1234
 
4.1%
C 1233
 
4.1%
Y 1228
 
4.0%
N 1209
 
4.0%
J 1203
 
4.0%
D 1193
 
3.9%
F 1193
 
3.9%
P 1186
 
3.9%
S 1184
 
3.9%
H 1184
 
3.9%
Other values (16) 18396
60.4%
Decimal Number
ValueCountFrequency (%)
3 1221
10.4%
2 1220
10.4%
6 1196
10.2%
9 1176
10.0%
5 1166
9.9%
0 1161
9.9%
8 1157
9.9%
1 1150
9.8%
4 1142
9.7%
7 1131
9.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 60766
83.8%
Common 11720
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 1253
 
2.1%
Q 1234
 
2.0%
C 1233
 
2.0%
Y 1228
 
2.0%
a 1214
 
2.0%
r 1212
 
2.0%
N 1209
 
2.0%
j 1204
 
2.0%
J 1203
 
2.0%
m 1198
 
2.0%
Other values (42) 48578
79.9%
Common
ValueCountFrequency (%)
3 1221
10.4%
2 1220
10.4%
6 1196
10.2%
9 1176
10.0%
5 1166
9.9%
0 1161
9.9%
8 1157
9.9%
1 1150
9.8%
4 1142
9.7%
7 1131
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72486
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 1253
 
1.7%
Q 1234
 
1.7%
C 1233
 
1.7%
Y 1228
 
1.7%
3 1221
 
1.7%
2 1220
 
1.7%
a 1214
 
1.7%
r 1212
 
1.7%
N 1209
 
1.7%
j 1204
 
1.7%
Other values (52) 60258
83.1%

trans_approval_code
Text

UNIQUE 

Distinct4027
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size31.6 KiB

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters24162
Distinct characters36
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4027 ?
Unique (%)100.0%

Sample

1st row0PQWLM
2nd row8BJL8D
3rd rowNK42HM
4th rowJLIPDS
5th rowGPTWZG
ValueCountFrequency (%)
0pqwlm 1
 
< 0.1%
w6znh9 1
 
< 0.1%
ap3biu 1
 
< 0.1%
2excuw 1
 
< 0.1%
nk42hm 1
 
< 0.1%
jlipds 1
 
< 0.1%
gptwzg 1
 
< 0.1%
ojprhp 1
 
< 0.1%
ycxycs 1
 
< 0.1%
3j8b9i 1
 
< 0.1%
Other values (4017) 4017
99.8%

Most occurring characters

ValueCountFrequency (%)
K 720
 
3.0%
3 717
 
3.0%
7 716
 
3.0%
B 708
 
2.9%
J 707
 
2.9%
O 703
 
2.9%
G 698
 
2.9%
Z 693
 
2.9%
E 689
 
2.9%
H 685
 
2.8%
Other values (26) 17126
70.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 17451
72.2%
Decimal Number 6711
 
27.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 720
 
4.1%
B 708
 
4.1%
J 707
 
4.1%
O 703
 
4.0%
G 698
 
4.0%
Z 693
 
4.0%
E 689
 
3.9%
H 685
 
3.9%
W 677
 
3.9%
V 675
 
3.9%
Other values (16) 10496
60.1%
Decimal Number
ValueCountFrequency (%)
3 717
10.7%
7 716
10.7%
9 682
10.2%
4 679
10.1%
6 674
10.0%
1 671
10.0%
0 671
10.0%
5 670
10.0%
2 617
9.2%
8 614
9.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 17451
72.2%
Common 6711
 
27.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
K 720
 
4.1%
B 708
 
4.1%
J 707
 
4.1%
O 703
 
4.0%
G 698
 
4.0%
Z 693
 
4.0%
E 689
 
3.9%
H 685
 
3.9%
W 677
 
3.9%
V 675
 
3.9%
Other values (16) 10496
60.1%
Common
ValueCountFrequency (%)
3 717
10.7%
7 716
10.7%
9 682
10.2%
4 679
10.1%
6 674
10.0%
1 671
10.0%
0 671
10.0%
5 670
10.0%
2 617
9.2%
8 614
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24162
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
K 720
 
3.0%
3 717
 
3.0%
7 716
 
3.0%
B 708
 
2.9%
J 707
 
2.9%
O 703
 
2.9%
G 698
 
2.9%
Z 693
 
2.9%
E 689
 
2.9%
H 685
 
2.8%
Other values (26) 17126
70.9%

trans_loc
Categorical

HIGH CORRELATION 

Distinct10
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
Delhi
507 
Chennai
472 
Pune
451 
Jaipur
419 
Hyderabad
419 
Other values (5)
1759 

Length

Max length9
Median length7
Mean length6.7698038
Min length4

Characters and Unicode

Total characters27262
Distinct characters29
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDelhi
2nd rowDelhi
3rd rowJaipur
4th rowJaipur
5th rowJaipur

Common Values

ValueCountFrequency (%)
Delhi 507
12.6%
Chennai 472
11.7%
Pune 451
11.2%
Jaipur 419
10.4%
Hyderabad 419
10.4%
Kolkata 406
10.1%
Bangalore 357
8.9%
Lucknow 349
8.7%
Mumbai 329
8.2%
Ahmedabad 318
7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
delhi 507
12.6%
chennai 472
11.7%
pune 451
11.2%
jaipur 419
10.4%
hyderabad 419
10.4%
kolkata 406
10.1%
bangalore 357
8.9%
lucknow 349
8.7%
mumbai 329
8.2%
ahmedabad 318
7.9%

Most occurring characters

ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23235
85.2%
Uppercase Letter 4027
 
14.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4220
18.2%
e 2524
10.9%
n 2101
9.0%
i 1727
 
7.4%
u 1548
 
6.7%
d 1474
 
6.3%
h 1297
 
5.6%
l 1270
 
5.5%
r 1195
 
5.1%
o 1112
 
4.8%
Other values (9) 4767
20.5%
Uppercase Letter
ValueCountFrequency (%)
D 507
12.6%
C 472
11.7%
P 451
11.2%
H 419
10.4%
J 419
10.4%
K 406
10.1%
B 357
8.9%
L 349
8.7%
M 329
8.2%
A 318
7.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 27262
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4220
15.5%
e 2524
 
9.3%
n 2101
 
7.7%
i 1727
 
6.3%
u 1548
 
5.7%
d 1474
 
5.4%
h 1297
 
4.8%
l 1270
 
4.7%
r 1195
 
4.4%
o 1112
 
4.1%
Other values (19) 8794
32.3%

trans_cat
Categorical

Distinct7
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
Travel
594 
Grocery
587 
Dining
586 
Retail
572 
Other
571 
Other values (2)
1117 

Length

Max length13
Median length9
Mean length7.4012913
Min length5

Characters and Unicode

Total characters29805
Distinct characters22
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEntertainment
2nd rowGrocery
3rd rowUtilities
4th rowGrocery
5th rowDining

Common Values

ValueCountFrequency (%)
Travel 594
14.8%
Grocery 587
14.6%
Dining 586
14.6%
Retail 572
14.2%
Other 571
14.2%
Entertainment 569
14.1%
Utilities 548
13.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
travel 594
14.8%
grocery 587
14.6%
dining 586
14.6%
retail 572
14.2%
other 571
14.2%
entertainment 569
14.1%
utilities 548
13.6%

Most occurring characters

ValueCountFrequency (%)
e 4010
13.5%
i 3957
13.3%
t 3946
13.2%
r 2908
9.8%
n 2879
9.7%
a 1735
 
5.8%
l 1714
 
5.8%
T 594
 
2.0%
v 594
 
2.0%
c 587
 
2.0%
Other values (12) 6881
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25778
86.5%
Uppercase Letter 4027
 
13.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4010
15.6%
i 3957
15.4%
t 3946
15.3%
r 2908
11.3%
n 2879
11.2%
a 1735
6.7%
l 1714
6.6%
v 594
 
2.3%
c 587
 
2.3%
y 587
 
2.3%
Other values (5) 2861
11.1%
Uppercase Letter
ValueCountFrequency (%)
T 594
14.8%
G 587
14.6%
D 586
14.6%
R 572
14.2%
O 571
14.2%
E 569
14.1%
U 548
13.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 29805
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4010
13.5%
i 3957
13.3%
t 3946
13.2%
r 2908
9.8%
n 2879
9.7%
a 1735
 
5.8%
l 1714
 
5.8%
T 594
 
2.0%
v 594
 
2.0%
c 587
 
2.0%
Other values (12) 6881
23.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29805
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4010
13.5%
i 3957
13.3%
t 3946
13.2%
r 2908
9.8%
n 2879
9.7%
a 1735
 
5.8%
l 1714
 
5.8%
T 594
 
2.0%
v 594
 
2.0%
c 587
 
2.0%
Other values (12) 6881
23.1%

trans_currency
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct7
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.6 KiB
INR
3296 
CAD
 
136
EUR
 
135
AUD
 
124
Other
 
114
Other values (2)
 
222

Length

Max length5
Median length3
Mean length3.0566178
Min length3

Characters and Unicode

Total characters12309
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowINR
2nd rowEUR
3rd rowINR
4th rowINR
5th rowEUR

Common Values

ValueCountFrequency (%)
INR 3296
81.8%
CAD 136
 
3.4%
EUR 135
 
3.4%
AUD 124
 
3.1%
Other 114
 
2.8%
JPY 113
 
2.8%
USD 109
 
2.7%
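
The 59.0% imbalance figure reported for trans_currency can be reproduced from these counts. The sketch below assumes the score is one minus the Shannon entropy of the value distribution divided by its maximum possible entropy; the function name `imbalance_score` is illustrative and the formula is an assumption that happens to match the reported value, not a quotation of the profiler's code.

```python
import math

# Value counts for trans_currency, copied from the Common Values table.
counts = {"INR": 3296, "CAD": 136, "EUR": 135, "AUD": 124,
          "Other": 114, "JPY": 113, "USD": 109}

def imbalance_score(value_counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of counts."""
    total = sum(value_counts)
    n_classes = len(value_counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in value_counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by a single value approaches 1.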

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
inr 3296
81.8%
cad 136
 
3.4%
eur 135
 
3.4%
aud 124
 
3.1%
other 114
 
2.8%
jpy 113
 
2.8%
usd 109
 
2.7%

Most occurring characters

ValueCountFrequency (%)
R 3431
27.9%
I 3296
26.8%
N 3296
26.8%
D 369
 
3.0%
U 368
 
3.0%
A 260
 
2.1%
C 136
 
1.1%
E 135
 
1.1%
e 114
 
0.9%
r 114
 
0.9%
Other values (7) 790
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11853
96.3%
Lowercase Letter 456
 
3.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 3431
28.9%
I 3296
27.8%
N 3296
27.8%
D 369
 
3.1%
U 368
 
3.1%
A 260
 
2.2%
C 136
 
1.1%
E 135
 
1.1%
O 114
 
1.0%
J 113
 
1.0%
Other values (3) 335
 
2.8%
Lowercase Letter
ValueCountFrequency (%)
e 114
25.0%
r 114
25.0%
h 114
25.0%
t 114
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12309
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 3431
27.9%
I 3296
26.8%
N 3296
26.8%
D 369
 
3.0%
U 368
 
3.0%
A 260
 
2.1%
C 136
 
1.1%
E 135
 
1.1%
e 114
 
0.9%
r 114
 
0.9%
Other values (7) 790
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12309
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 3431
27.9%
I 3296
26.8%
N 3296
26.8%
D 369
 
3.0%
U 368
 
3.0%
A 260
 
2.1%
C 136
 
1.1%
E 135
 
1.1%
e 114
 
0.9%
r 114
 
0.9%
Other values (7) 790
 
6.4%

mcc
Real number (ℝ)

Distinct: 20
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4008.2754
Minimum: 4000
Maximum: 4019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.6 KiB

Quantile statistics

Minimum: 4000
5-th percentile: 4001
Q1: 4003
Median: 4008
Q3: 4013
95-th percentile: 4017
Maximum: 4019
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.3969963
Coefficient of variation (CV): 0.0013464634
Kurtosis: -1.1717825
Mean: 4008.2754
Median Absolute Deviation (MAD): 5
Skewness: 0.2417021
Sum: 16141325
Variance: 29.127569
Monotonicity: Not monotonic
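
Several of the descriptive statistics for mcc are tied together by simple identities, so they can be cross-checked from the reported figures alone. The sketch below recomputes the mean, variance, coefficient of variation, range, and IQR from the values printed in this report. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the mcc statistics in this report.
n_observations = 4027
total = 16141325          # "Sum"
std_dev = 5.3969963       # "Standard deviation"
minimum, maximum = 4000, 4019
q1, q3 = 4003, 4013

mean = total / n_observations      # mean = sum / n
variance = std_dev ** 2            # variance = std^2
cv = std_dev / mean                # coefficient of variation = std / mean
value_range = maximum - minimum    # range = max - min
iqr = q3 - q1                      # interquartile range = Q3 - Q1

print(f"mean     = {mean:.4f}")
print(f"variance = {variance:.6f}")
print(f"cv       = {cv:.10f}")
print(f"range    = {value_range}")
print(f"iqr      = {iqr}")
```

The very small CV reflects that the 20 mcc codes sit in a narrow band (4000 to 4019) around a large mean, so dispersion statistics say little here; the variable behaves more like a categorical code than a measurement.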
Histogram with fixed size bins (bins=20)

Most frequent values

Value              Count  Frequency (%)
4005               415    10.3%
4003               361    9.0%
4001               343    8.5%
4014               305    7.6%
4006               304    7.5%
4016               301    7.5%
4009               266    6.6%
4010               242    6.0%
4002               237    5.9%
4012               185    4.6%
Other values (10)  1068   26.5%

Smallest 10 values

Value  Count  Frequency (%)
4000   114    2.8%
4001   343    8.5%
4002   237    5.9%
4003   361    9.0%
4004   108    2.7%
4005   415    10.3%
4006   304    7.5%
4007   61     1.5%
4008   183    4.5%
4009   266    6.6%

Largest 10 values

Value  Count  Frequency (%)
4019   69     1.7%
4018   79     2.0%
4017   124    3.1%
4016   301    7.5%
4015   72     1.8%
4014   305    7.6%
4013   162    4.0%
4012   185    4.6%
4011   96     2.4%
4010   242    6.0%

trans_amount
Real number (ℝ)

HIGH CORRELATION 

Distinct: 3348
Distinct (%): 83.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10876.342
Minimum: 1.08
Maximum: 49924.762
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.6 KiB

Quantile statistics

Minimum: 1.08
5-th percentile: 9.6650001
Q1: 41
Median: 2390.76
Q3: 19626.96
95-th percentile: 43557.77
Maximum: 49924.762
Range: 49923.682
Interquartile range (IQR): 19585.96

Descriptive statistics

Standard deviation: 15069.316
Coefficient of variation (CV): 1.3855132
Kurtosis: 0.058913674
Mean: 10876.342
Median Absolute Deviation (MAD): 2368.76
Skewness: 1.2287814
Sum: 43799031
Variance: 2.2708429 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count  Frequency (%)
17                   21     0.5%
41                   20     0.5%
7                    20     0.5%
30                   19     0.5%
3                    19     0.5%
42                   18     0.4%
45                   17     0.4%
46                   17     0.4%
28                   17     0.4%
23                   17     0.4%
Other values (3338)  3842   95.4%

Smallest 10 values

Value        Count  Frequency (%)
1.080000043  1      < 0.1%
1.129999995  1      < 0.1%
1.200000048  1      < 0.1%
1.389999986  1      < 0.1%
1.419999957  1      < 0.1%
1.470000029  1      < 0.1%
1.629999995  1      < 0.1%
1.659999967  1      < 0.1%
1.679999948  1      < 0.1%
1.870000005  1      < 0.1%

Largest 10 values

Value        Count  Frequency (%)
49924.76172  1      < 0.1%
49918        1      < 0.1%
49909        1      < 0.1%
49881        1      < 0.1%
49821        1      < 0.1%
49780        1      < 0.1%
49753.12109  1      < 0.1%
49751        1      < 0.1%
49748        1      < 0.1%
49744.96875  1      < 0.1%

trans_date
DateTime

Distinct: 4022
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 31.6 KiB
Minimum: 2023-08-10 06:20:36
Maximum: 2023-09-09 06:11:57
Histogram with fixed size bins (bins=50)

trans_payment_method
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.6 KiB

Value        Count
Credit Card  2036
Debit Card   1991

Length

Max length: 11
Median length: 11
Mean length: 10.505587
Min length: 10

Characters and Unicode

Total characters: 42306
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Credit Card
2nd row: Credit Card
3rd row: Debit Card
4th row: Credit Card
5th row: Credit Card

Common Values

Value        Count  Frequency (%)
Credit Card  2036   50.6%
Debit Card   1991   49.4%
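
The length statistics reported for the Credit Card / Debit Card variable follow directly from these two category counts. As a rough cross-check, the sketch below rebuilds the list of string lengths from the counts and recomputes the total characters, mean length, and median length. The variable names are illustrative and not taken from the report.

```python
from statistics import median

# Category counts copied from the Common Values table.
counts = {"Credit Card": 2036, "Debit Card": 1991}

# One length entry per observation: 11 for "Credit Card", 10 for "Debit Card".
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

total_characters = sum(lengths)
mean_length = total_characters / len(lengths)
median_length = median(lengths)

print(f"total characters = {total_characters}")
print(f"mean length      = {mean_length:.6f}")
print(f"median length    = {median_length}")
```

Because "Credit Card" is the slightly more frequent value, the median length lands on 11 rather than 10, while the mean sits between the two.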

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
card    4027   50.0%
credit  2036   25.3%
debit   1991   24.7%

Most occurring characters

ValueCountFrequency (%)
C 6063
14.3%
r 6063
14.3%
d 6063
14.3%
e 4027
9.5%
i 4027
9.5%
t 4027
9.5%
(space) 4027
9.5%
a 4027
9.5%
D 1991
 
4.7%
b 1991
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30225
71.4%
Uppercase Letter 8054
 
19.0%
Space Separator 4027
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 6063
20.1%
d 6063
20.1%
e 4027
13.3%
i 4027
13.3%
t 4027
13.3%
a 4027
13.3%
b 1991
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
C 6063
75.3%
D 1991
 
24.7%
Space Separator
ValueCountFrequency (%)
(space) 4027
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 38279
90.5%
Common 4027
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 6063
15.8%
r 6063
15.8%
d 6063
15.8%
e 4027
10.5%
i 4027
10.5%
t 4027
10.5%
a 4027
10.5%
D 1991
 
5.2%
b 1991
 
5.2%
Common
ValueCountFrequency (%)
(space) 4027
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42306
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 6063
14.3%
r 6063
14.3%
d 6063
14.3%
e 4027
9.5%
i 4027
9.5%
t 4027
9.5%
(space) 4027
9.5%
a 4027
9.5%
D 1991
 
4.7%
b 1991
 
4.7%

trans_verify_method
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.6 KiB

Value      Count
PIN        2026
Biometric  2001

Length

Max length: 9
Median length: 3
Mean length: 5.9813757
Min length: 3

Characters and Unicode

Total characters: 24087
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Biometric
2nd row: PIN
3rd row: PIN
4th row: PIN
5th row: PIN

Common Values

Value      Count  Frequency (%)
PIN        2026   50.3%
Biometric  2001   49.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count  Frequency (%)
pin        2026   50.3%
biometric  2001   49.7%

Most occurring characters

ValueCountFrequency (%)
i 4002
16.6%
P 2026
8.4%
I 2026
8.4%
N 2026
8.4%
B 2001
8.3%
o 2001
8.3%
m 2001
8.3%
e 2001
8.3%
t 2001
8.3%
r 2001
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16008
66.5%
Uppercase Letter 8079
33.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4002
25.0%
o 2001
12.5%
m 2001
12.5%
e 2001
12.5%
t 2001
12.5%
r 2001
12.5%
c 2001
12.5%
Uppercase Letter
ValueCountFrequency (%)
P 2026
25.1%
I 2026
25.1%
N 2026
25.1%
B 2001
24.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 24087
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4002
16.6%
P 2026
8.4%
I 2026
8.4%
N 2026
8.4%
B 2001
8.3%
o 2001
8.3%
m 2001
8.3%
e 2001
8.3%
t 2001
8.3%
r 2001
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24087
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4002
16.6%
P 2026
8.4%
I 2026
8.4%
N 2026
8.4%
B 2001
8.3%
o 2001
8.3%
m 2001
8.3%
e 2001
8.3%
t 2001
8.3%
r 2001
8.3%

trans_status
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.6 KiB

Value     Count
Transfer  1360
Payment   1354
Purchase  1313

Length

Max length: 8
Median length: 8
Mean length: 7.6637696
Min length: 7

Characters and Unicode

Total characters: 30862
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Transfer
2nd row: Transfer
3rd row: Transfer
4th row: Purchase
5th row: Transfer

Common Values

Value     Count  Frequency (%)
Transfer  1360   33.8%
Payment   1354   33.6%
Purchase  1313   32.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
transfer  1360   33.8%
payment   1354   33.6%
purchase  1313   32.6%

Most occurring characters

ValueCountFrequency (%)
r 4033
13.1%
a 4027
13.0%
e 4027
13.0%
n 2714
8.8%
s 2673
8.7%
P 2667
8.6%
T 1360
 
4.4%
f 1360
 
4.4%
y 1354
 
4.4%
m 1354
 
4.4%
Other values (4) 5293
17.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26835
87.0%
Uppercase Letter 4027
 
13.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4033
15.0%
a 4027
15.0%
e 4027
15.0%
n 2714
10.1%
s 2673
10.0%
f 1360
 
5.1%
y 1354
 
5.0%
m 1354
 
5.0%
t 1354
 
5.0%
u 1313
 
4.9%
Other values (2) 2626
9.8%
Uppercase Letter
ValueCountFrequency (%)
P 2667
66.2%
T 1360
33.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 30862
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4033
13.1%
a 4027
13.0%
e 4027
13.0%
n 2714
8.8%
s 2673
8.7%
P 2667
8.6%
T 1360
 
4.4%
f 1360
 
4.4%
y 1354
 
4.4%
m 1354
 
4.4%
Other values (4) 5293
17.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30862
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 4033
13.1%
a 4027
13.0%
e 4027
13.0%
n 2714
8.8%
s 2673
8.7%
P 2667
8.6%
T 1360
 
4.4%
f 1360
 
4.4%
y 1354
 
4.4%
m 1354
 
4.4%
Other values (4) 5293
17.2%

fake
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  2522   62.6%
True   1505   37.4%

Interactions


Correlations

Columns are numbered in the same order as the rows.

                              1       2       3       4       5       6       7       8       9      10      11      12      13      14      15      16
 1  card_number           1.000   0.017   0.041  -0.007   0.868   0.868   0.868   0.000   0.000   0.868   0.000   0.058   0.078   0.111   0.086   0.000
 2  customer_age          0.017   1.000  -0.018  -0.010   0.072   0.069   0.076   0.000   0.000   0.076   0.023   0.011   0.014   0.000   0.007   0.000
 3  mcc                   0.041  -0.018   1.000  -0.021   0.019   0.013   0.000   0.000   0.012   0.000   0.025   0.011   0.037   0.022   0.009   0.037
 4  trans_amount         -0.007  -0.010  -0.021   1.000   0.000   0.022   0.000   0.221   0.030   0.000   0.034   0.257   0.000   0.007   0.018   0.999
 5  customer_segment      0.868   0.072   0.019   0.000   1.000   0.042   0.091   0.000   0.014   0.091   0.011   0.000   0.014   0.040   0.000   0.000
 6  card_type             0.868   0.069   0.013   0.022   0.042   1.000   0.092   0.000   0.032   0.092   0.013   0.000   0.000   0.000   0.011   0.000
 7  customer_location     0.868   0.076   0.000   0.000   0.091   0.092   1.000   0.000   0.000   1.000   0.000   0.027   0.000   0.035   0.020   0.000
 8  merchant_name         0.000   0.000   0.000   0.221   0.000   0.000   0.000   1.000   0.019   0.000   0.020   0.163   0.000   0.000   0.000   0.633
 9  merchant_location     0.000   0.000   0.012   0.030   0.014   0.032   0.000   0.019   1.000   0.000   0.000   0.000   0.000   0.012   0.041   0.021
10  trans_loc             0.868   0.076   0.000   0.000   0.091   0.092   1.000   0.000   0.000   1.000   0.000   0.027   0.000   0.035   0.020   0.000
11  trans_cat             0.000   0.023   0.025   0.034   0.011   0.013   0.000   0.020   0.000   0.000   1.000   0.015   0.000   0.012   0.021   0.061
12  trans_currency        0.058   0.011   0.011   0.257   0.000   0.000   0.027   0.163   0.000   0.027   0.015   1.000   0.000   0.000   0.000   0.608
13  trans_payment_method  0.078   0.014   0.037   0.000   0.014   0.000   0.000   0.000   0.000   0.000   0.000   0.000   1.000   0.000   0.000   0.000
14  trans_verify_method   0.111   0.000   0.022   0.007   0.040   0.000   0.035   0.000   0.012   0.035   0.012   0.000   0.000   1.000   0.024   0.021
15  trans_status          0.086   0.007   0.009   0.018   0.000   0.011   0.020   0.000   0.041   0.020   0.021   0.000   0.000   0.024   1.000   0.000
16  fake                  0.000   0.000   0.037   0.999   0.000   0.000   0.000   0.633   0.021   0.000   0.061   0.608   0.000   0.021   0.000   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

row | cardholder_name | card_number | customer_age | customer_segment | card_type | customer_location | merchant_name | merchant_location | trans_id+trans_approval_code | trans_loc | trans_cat | trans_currency | mcc | trans_amount | trans_date | trans_payment_method | trans_verify_method | trans_status | fake
0 | Kaitlin Edwards | 103313584129 | 18 | Business | MasterCard | Delhi | airtel | Toronto, Canada | AEBUCBZwjn20SdfGhM0PQWLM | Delhi | Entertainment | INR | 4011 | 2237.000000 | 2023-08-24 20:00:22 | Credit Card | Biometric | Transfer | False
1 | Kaitlin Edwards | 103313584129 | 18 | Business | MasterCard | Delhi | amazon | Cape Town, South Africa | JxnOEJ8q7Y61CLUW3m8BJL8D | Delhi | Grocery | EUR | 4007 | 13546.000000 | 2023-08-24 20:00:47 | Credit Card | PIN | Transfer | True
2 | Kyle Armstrong | 791300507301 | 64 | Retail | American Express | Jaipur | chai talks | Sydney, Australia | srGxuWPUWLuRlx6ACRNK42HM | Jaipur | Utilities | INR | 4001 | 36.419998 | 2023-08-25 09:09:36 | Debit Card | PIN | Transfer | False
3 | Kyle Armstrong | 791300507301 | 64 | Retail | American Express | Jaipur | fake_merchant_6 | Paris, France | 5OmAOD8rKVnyn26d5cJLIPDS | Jaipur | Grocery | INR | 4017 | 13674.389648 | 2023-08-25 22:01:21 | Credit Card | PIN | Purchase | True
4 | Kyle Armstrong | 791300507301 | 64 | Retail | American Express | Jaipur | instamart | Kolkata | SbZLIak4yqaYK1lbdYGPTWZG | Jaipur | Dining | EUR | 4009 | 47821.578125 | 2023-08-29 14:54:21 | Credit Card | PIN | Transfer | True
5 | Stephanie Washington | 973891787445 | 42 | Student | Rupay | Mumbai | amazon gift cards | New York City, USA | rAHmTzC4udFmzTRJNNOJPRHP | Mumbai | Retail | INR | 4012 | 705.000000 | 2023-08-13 09:31:36 | Debit Card | Biometric | Transfer | False
6 | Stephanie Washington | 973891787445 | 42 | Student | Rupay | Mumbai | chai talks | Paris, France | f32xiqFPFkE6y68dvBYCXYCS | Mumbai | Travel | Other | 4001 | 8145.000000 | 2023-08-19 22:34:05 | Debit Card | Biometric | Purchase | True
7 | Jennifer Russell DVM | 125611765277 | 29 | Retail | American Express | Hyderabad | amazon | London, UK | dpJFgELLzXC4rxpg2m3J8B9I | Hyderabad | Retail | INR | 4006 | 24.370001 | 2023-08-16 13:17:11 | Credit Card | Biometric | Purchase | False
8 | Jennifer Russell DVM | 125611765277 | 29 | Retail | American Express | Hyderabad | fake_merchant_4 | Dubai, UAE | Xsccxz3k7php9ByBkN79WPCU | Hyderabad | Travel | AUD | 4005 | 19232.589844 | 2023-09-02 03:07:06 | Credit Card | PIN | Purchase | True
9 | Jennifer Russell DVM | 125611765277 | 29 | Retail | American Express | Hyderabad | zomato | Hyderabad | uylbQmwaIFCYmqObvILDC09W | Hyderabad | Other | INR | 4000 | 21779.000000 | 2023-09-04 23:38:51 | Debit Card | PIN | Payment | True

row | cardholder_name | card_number | customer_age | customer_segment | card_type | customer_location | merchant_name | merchant_location | trans_id+trans_approval_code | trans_loc | trans_cat | trans_currency | mcc | trans_amount | trans_date | trans_payment_method | trans_verify_method | trans_status | fake
4017 | Angela Beltran | 281455872457 | 39 | Other | Other | Lucknow | airtel | Delhi | aQJGKbGoZ0yKJzAr763BAFQU | Lucknow | Travel | INR | 4002 | 2803.000000 | 2023-09-07 00:35:10 | Debit Card | PIN | Purchase | False
4018 | Angela Beltran | 281455872457 | 39 | Other | Other | Lucknow | rakuten | London, UK | pfLm74WaQYgxALbQZ3643BCA | Lucknow | Entertainment | INR | 4005 | 3414.000000 | 2023-09-08 18:48:04 | Debit Card | PIN | Transfer | False
4019 | Angela Beltran | 281455872457 | 39 | Other | Other | Lucknow | zomato | Chennai | agljIfFQunE4tyLEfPIJ0TWX | Lucknow | Grocery | CAD | 4016 | 42420.000000 | 2023-09-08 20:25:43 | Credit Card | Biometric | Purchase | True
4020 | Angela Beltran | 281455872457 | 39 | Other | Other | Lucknow | rakuten | London, UK | OSuNrJPA1NEO3gLkKZCN7W7R | Lucknow | Dining | INR | 4005 | 42573.378906 | 2023-09-08 21:51:53 | Credit Card | PIN | Payment | True
4021 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | swiggy | New York City, USA | C01WEuQem3LaesQpFiIWNQOC | Bangalore | Retail | INR | 4007 | 39.000000 | 2023-08-17 23:03:37 | Credit Card | PIN | Transfer | False
4022 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | chai talks | Dubai, UAE | VFHZYPdizRqt4RGlyJ0751NG | Bangalore | Utilities | INR | 4010 | 19.000000 | 2023-08-22 13:42:45 | Debit Card | Biometric | Payment | False
4023 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | zomato | Rio de Janeiro, Brazil | Nhl1drXCajpf56YrtdCFNWYV | Bangalore | Other | INR | 4015 | 35.730000 | 2023-09-05 00:14:46 | Credit Card | Biometric | Payment | False
4024 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | instamart | Jaipur | KXmMkpFK0bkHgDIL6CE14GSP | Bangalore | Retail | INR | 4016 | 48.000000 | 2023-09-06 16:57:46 | Credit Card | PIN | Payment | False
4025 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | airtel | Bangalore | QdFkY0E8J0pW6GzhmMZWAD4S | Bangalore | Entertainment | INR | 4009 | 24696.000000 | 2023-09-08 15:35:16 | Debit Card | Biometric | Transfer | True
4026 | Ashley Garrett | 794550778485 | 29 | Other | Visa | Bangalore | rakuten | Bangalore | tXilEM5m3NxI6WYIq1JS6JHI | Bangalore | Utilities | USD | 4006 | 9280.509766 | 2023-09-08 21:10:55 | Credit Card | Biometric | Purchase | True
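
The correlation table reports 0.999 between trans_amount and fake. In the 20 sample rows shown here, that relationship appears as a clean split by amount. The sketch below checks this using only those rows, so it is an illustration and says nothing certain about the remaining 4,007 observations. The variable names are illustrative.

```python
# (trans_amount, fake) pairs transcribed from the 20 sample rows of this report.
sample = [
    (2237.000000, False), (13546.000000, True), (36.419998, False),
    (13674.389648, True), (47821.578125, True), (705.000000, False),
    (8145.000000, True), (24.370001, False), (19232.589844, True),
    (21779.000000, True), (2803.000000, False), (3414.000000, False),
    (42420.000000, True), (42573.378906, True), (39.000000, False),
    (19.000000, False), (35.730000, False), (48.000000, False),
    (24696.000000, True), (9280.509766, True),
]

genuine_amounts = [amount for amount, is_fake in sample if not is_fake]
flagged_amounts = [amount for amount, is_fake in sample if is_fake]

largest_genuine = max(genuine_amounts)
smallest_flagged = min(flagged_amounts)

# If every flagged amount exceeds every genuine amount, a single
# threshold on trans_amount separates the two groups in this sample.
separable = largest_genuine < smallest_flagged

print(f"largest genuine amount:  {largest_genuine:.2f}")
print(f"smallest flagged amount: {smallest_flagged:.2f}")
print(f"separable by one threshold: {separable}")
```

A near-perfect association like this is worth treating with caution before modelling: if fake can be recovered from trans_amount alone, a classifier may simply learn the amount threshold rather than anything about merchants, locations, or currencies.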